SQL Server 2008 : Developing Custom Managed Database Objects (part 5) - Developing Managed User-Defined Aggregates

3/31/2011 11:32:33 AM

Developing Managed User-Defined Aggregates (UDAs)

A highly specialized feature of SQL Server 2008, managed user-defined aggregates (UDAs) provide the capability to aggregate column data based on user-defined criteria built in to .NET code. You can now extend the (somewhat small) list of aggregate functions usable inside SQL Server to include those you custom-define.

The implementation contract for a UDA requires the following:

A static method called Init(), used to initialize any data fields in the struct, particularly the field that contains the aggregated value.
A static method called Terminate(), used to return the aggregated value to the UDA’s caller.
A static method called Aggregate(), used to add the value in the current row to the growing value.
A static method called Merge(), used when SQL Server breaks an aggregation task into multiple threads of execution (SQL Server actually uses a thread abstraction called a task), each of which needs to merge the value stored in its instance of the UDA with the growing value.

UDAs cannot do any data access, nor can they have any side-effects—meaning they cannot change the state of the database. They take only a single input parameter, of any type. You can also add public methods or properties other than those required by the contract (such as the IsPrime() method used in the following example).

Like UDTs, UDAs are structs. They are decorated with the SqlUserDefinedAggregate attribute, which has the following parameters for its constructor:

Format— Tells SQL Server how serialization (and its complement, deserialization) of the struct should be done. This has the same possible values and meaning as described earlier for SqlUserDefinedType.
A named parameter list— This list contains the following:
- IsInvariantToDuplicates—Tells SQL Server whether the UDA behaves differently with respect to duplicate values passed in from multiple rows.
- IsInvariantToNulls—Tells SQL Server whether the UDA behaves differently when null values are passed to it.
- IsInvariantToOrder—Tells SQL Server whether the UDA cares about the order in which column values are fed to it.
- IsNullIfEmpty—Tells SQL Server that the UDA will return null if its aggregated value is empty (that is, if its value is 0, or the empty string "", and so on).
- Name—Tells the deployment routine what to call the UDA when it is created in the database.
- MaxByteSize—Tells SQL Server not to allow more than the specified number of bytes to be held in an instance of the UDA. You must specify this when using Format.UserDefined.

For this example, you implement a very simple UDA that sums values in an integer column, but only if they are prime. Listing 8 shows the code to do this.

Listing 8. A UDA That Sums Prime Numbers

using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
  Format.Native,
  IsInvariantToDuplicates=false,
  IsInvariantToNulls=true,
  IsInvariantToOrder=true,
  IsNullIfEmpty=true
)]
public struct SumPrime
{
  SqlInt64 Sum;

  private bool IsPrime(SqlInt64 Number)
  {
    for (int i = 2; i < Number; i++)
    {
      if (Number % i == 0)
      {
        return false;
       }
     }
     return true;
  }

  public void Init()
  {
    Sum = 0;
  }

  public void Accumulate(SqlInt64 Value)
  {
    if (!Value.IsNull && IsPrime(Value) && Value > 1)
      Sum += Value;
  }

  public void Merge(SumPrime Prime)
  {
    Sum += Prime.Sum;
  }

  public SqlInt64 Terminate()
  {
        return Sum;
  }
}

In this code, SQL Server first calls Init(), initializing the private Sum data field to 0.

For each column value passed to the aggregate, the Accumulate() method is called, wherein Sum is increased by the value of the column, if it is prime.

When multiple threads converge, Merge() is called, adding the values stored in each instance (as the Prime parameter) to Sum.

When the rowset has been completely parsed, SQL Server calls Terminate(), wherein the accumulated value Sum is returned.

Following are the results of testing SumPrime on Production.Product (an existing AdventureWorks2008 table):

SELECT TOP 10 dbo.SumPrime(p.ProductId) AS PrimeSum, p.Name
FROM Production.Product p
JOIN Production.WorkOrder o ON
o.ProductId = p.ProductId
WHERE Name LIKE '%Frame%'
GROUP BY p.ProductId, p.Name
ORDER BY PrimeSum DESC
go
PrimeSum  Name
--------------------------------------------
360355    HL Mountain Frame - Black, 42
338462    HL Mountain Frame - Silver, 42
266030    HL Road Frame - Red, 48
214784    HL Road Frame - Black, 48
133937    HL Touring Frame - Yellow, 46
68338     LL Road Frame - Red, 52
54221     LL Mountain Frame - Silver, 48
15393     ML Road Frame - Red, 52
0         HL Mountain Frame - Black, 38
0         HL Road Frame - Black, 44
(10 row(s) affected.)

Following is the DDL syntax for this UDA:

CREATE AGGREGATE SumPrime(@Number bigint)
RETURNS bigint
EXTERNAL NAME SQLCLR.SumPrime

As with UDTs, with UDAs there is no ALTER AGGREGATE, but you can use DROP AGGREGATE to drop them.

Related -----------------

- SQL Server 2008 : Developing Custom Managed Database Objects (part 7) - Using Transactions & Using the Related System Catalogs

- SQL Server 2008 : Developing Custom Managed Database Objects (part 6) - Developing Managed Triggers

- SQL Server 2008 : Developing Custom Managed Database Objects (part 4) - Developing Managed User-Defined Types

- SQL Server 2008 : Developing Custom Managed Database Objects (part 3) - Developing Managed User-Defined Functions

- SQL Server 2008 : Developing Custom Managed Database Objects (part 2) - Developing Managed Stored Procedures

- SQL Server 2008 : Developing Custom Managed Database Objects (part 1)

Other -----------------

- SQL Server 2008 : Profiler Usage Scenarios (part 2)

- SQL Server 2008 : Profiler Usage Scenarios (part 1) - Analyzing Slow Stored Procedures or Queries & Deadlocks

- SQL Server 2008 : Defining Server-Side Traces

- SQL Server 2008 : SQL Server Profiler - Replaying Trace Data

- SQL Server 2008 : SQL Server Profiler - Saving and Exporting Traces

- SQL Server 2008 : SQL Server Profiler - Creating Traces

- SQL Server 2008 : SQL Server Profiler Architecture

- SQL Server 2008: Administering Database Objects - Working with Tables (part 7) - Partitions

- SQL Server 2008: Administering Database Objects - Working with Tables (part 6) - Compression

- SQL Server 2008: Administering Database Objects - Working with Tables (part 5) - Sparse Columns

- SQL Server 2008: Administering Database Objects - Working with Tables (part 4) - Check Constraints

- SQL Server 2008: Administering Database Objects - Working with Tables (part 3) - Foreign Key Constraints

- SQL Server 2008: Administering Database Objects - Working with Tables (part 2) - Primary Key Constraints & Unique Constraints

- SQL Server 2008: Administering Database Objects - Working with Tables (part 1) - Default Constraints

- SQL Server 2008: Administering Database Objects - Working with Database Snapshots

- Programming with SQL Azure : WCF Data Services (part 3)

- Programming with SQL Azure : WCF Data Services (part 2) - Creating the Client Application

- Using XML in SQL Server 2008: Relational Data As XML - The FOR XML Modes (part 4) - EXPLICIT Mode

- Using XML in SQL Server 2008: Relational Data As XML - The FOR XML Modes (part 3) - AUTO Mode

- Programming with SQL Azure : WCF Data Services (part 1)